This problem set corresponds to the central composite design (CCD) section of the advanced designs chapter in the course https://doe.dscoe.org. For this problem set, we will develop requirements for a hypothetical future ground combat vehicle. The solutions to the problem set use the packages shown below; however, you may use whatever packages you prefer.
library(tidyverse)
library(rsm)
library(GGally)
library(readr)
library(bestglm)
library(plotly)
Generate a CCD that consists of four factors at three levels each with the following names and ranges of values. We will eventually upload the design matrix to a Shiny app to obtain simulation results, so please ensure the column names match those below.
| Factor | Low | High | Remarks |
|---|---|---|---|
| Gun | 1.0 | 3.0 | Main gun range (km) |
| ATGM | 1.0 | 8.0 | Anti-tank guided missile range (km) |
| Sensor | 1.0 | 10.0 | Sensor range (km) |
| APS | 0.0 | 1.0 | Level of protection provided by the active protection system (0.0 = no protection) |
Display the resulting design matrix.
ccd = cube(4, n0=0,
coding = list(x1~(Gun-2)/1, x2~(ATGM-4.5)/3.5, x3~(Sensor-5.5)/4.5, x4~(APS-0.5)/0.5))
ccd = djoin(ccd, star(alpha=1, n0=1))
ccd = decode.data(ccd)
ccd %>% select(Gun, ATGM, Sensor, APS)
## Gun ATGM Sensor APS
## 1 1 1.0 1.0 1.0
## 2 3 1.0 10.0 0.0
## 3 3 8.0 1.0 1.0
## 4 1 1.0 1.0 0.0
## 5 1 1.0 10.0 1.0
## 6 1 1.0 10.0 0.0
## 7 3 8.0 10.0 1.0
## 8 1 8.0 10.0 1.0
## 9 1 8.0 10.0 0.0
## 10 1 8.0 1.0 1.0
## 11 3 1.0 10.0 1.0
## 12 3 1.0 1.0 1.0
## 13 3 8.0 1.0 0.0
## 14 1 8.0 1.0 0.0
## 15 3 8.0 10.0 0.0
## 16 3 1.0 1.0 0.0
## 17 2 4.5 10.0 0.5
## 18 2 4.5 1.0 0.5
## 19 2 4.5 5.5 0.0
## 20 3 4.5 5.5 0.5
## 21 1 4.5 5.5 0.5
## 22 2 8.0 5.5 0.5
## 23 2 4.5 5.5 1.0
## 24 2 1.0 5.5 0.5
## 25 2 4.5 5.5 0.5
Produce a pairs plot of the four factors.
pairs(ccd %>% select(Gun, ATGM, Sensor, APS))
How many design points does your design matrix have? How many design points does the equivalent gridded design have?
# The CCD has 25 design points.
# The gridded design has 3^4 = 81 design points.
If one of our factors is sensor = as.factor(c('2G', '3G', '4G')), should we include it in a CCD? Why or why not?
# No, because sensor is a categorical variable and 3G is not half way between 2G and 4G.
Save your CCD design matrix as a .csv file. Then, upload it to the Combat Simulator Shiny app to run a simulation for each of your design points. Navigate to the tab titled “Central Composite Designs Problem Set”, upload your .csv file, and then download the results. Next, read the data into R, and produce a pairs plot with a smoother.
# create the .csv file
# readr::write_csv(ccd %>% select(Gun, ATGM, Sensor, APS), "ccd.csv")
# do the simulation
# read the results back in
omfv = read.csv("ccd_out.csv", sep=',', header=TRUE)
smooth_fn <- function(data, mapping, ...){
ggplot(data = data, mapping = mapping) +
geom_point() +
geom_smooth(formula = y~x, method=loess, fill="red", color="red", se=FALSE, ...)
}
ggpairs(omfv, lower=list(continuous=smooth_fn), progress=FALSE) +
ggtitle("CCD Results") +
theme_bw()
Check for potential two-way interactions and first-order effects. Then, using the model selection method of your choice, fit, summarize, and interpret your recommended linear model.
# check two-way interactions
summary(lm(kills ~ .^2, data=omfv))
##
## Call:
## lm(formula = kills ~ .^2, data = omfv)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.0667 -2.1567 -1.5592 0.1694 6.5289
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.61744 5.90679 1.120 0.281
## gun 2.70904 2.41661 1.121 0.281
## atgm -0.25332 0.83601 -0.303 0.766
## sensor -0.56342 0.65721 -0.857 0.406
## aps 4.85017 6.10544 0.794 0.440
## gun:atgm -0.15328 0.30768 -0.498 0.626
## gun:sensor -0.01777 0.23931 -0.074 0.942
## gun:aps -0.14594 2.15379 -0.068 0.947
## atgm:sensor 0.54810 0.06837 8.016 1.34e-06 ***
## atgm:aps 0.13929 0.61537 0.226 0.824
## sensor:aps 0.02599 0.47862 0.054 0.957
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.308 on 14 degrees of freedom
## Multiple R-squared: 0.9402, Adjusted R-squared: 0.8974
## F-statistic: 22 on 10 and 14 DF, p-value: 7.314e-07
# atgm:sensor appears significant
# check main effects
summary(lm(kills ~ ., data=omfv))
##
## Call:
## lm(formula = kills ~ ., data = omfv)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.043 -9.605 5.005 6.360 8.322
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.6119 5.9924 -0.937 0.360181
## gun 1.8486 2.0124 0.919 0.369247
## atgm 2.5243 0.5750 4.390 0.000283 ***
## sensor 1.8805 0.4472 4.205 0.000436 ***
## aps 5.3280 4.0248 1.324 0.200501
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.538 on 20 degrees of freedom
## Multiple R-squared: 0.6642, Adjusted R-squared: 0.597
## F-statistic: 9.888 on 4 and 20 DF, p-value: 0.0001395
# see what we get using AIC for model selection
bestglm(omfv, IC="AIC")
## AIC
## BICq equivalent for q in (0.645735531288494, 0.748919916833536)
## Best Model:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.914720 4.4230021 -0.4329007 0.6695005252
## atgm 2.524299 0.5728269 4.4067404 0.0002456783
## sensor 1.880475 0.4455320 4.2207410 0.0003832747
## aps 5.328012 4.0097883 1.3287515 0.1981885607
# AIC results don't include gun.
# There are only four factors, so manual selection of
# factors isn't unreasonable.
# all factors except gun plus the two-way interaction
omfv.lm = lm(kills ~ atgm + sensor + aps + atgm:sensor, data = omfv)
summary(omfv.lm)
##
## Call:
## lm(formula = kills ~ atgm + sensor + aps + atgm:sensor, data = omfv)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.6190 -2.8208 -0.4591 1.5386 8.3775
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.65067 2.63551 4.421 0.000263 ***
## atgm -0.49023 0.44588 -1.099 0.284627
## sensor -0.58596 0.35820 -1.636 0.117513
## aps 5.32801 1.90637 2.795 0.011184 *
## atgm:sensor 0.54810 0.06419 8.539 4.2e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.044 on 20 degrees of freedom
## Multiple R-squared: 0.9247, Adjusted R-squared: 0.9096
## F-statistic: 61.36 on 4 and 20 DF, p-value: 6.041e-11
# looks like everything's significant
# a negative coefficent for atgm is counterintuitive
Check model assumptions and the need for transformations, and make any adjustments necessary to improve the model.
plot(omfv.lm)
# we almost reject the normality assumption. consider transformation
shapiro.test(residuals(omfv.lm))
##
## Shapiro-Wilk normality test
##
## data: residuals(omfv.lm)
## W = 0.9228, p-value = 0.05934
# we fail to reject homoscedasticity
car::ncvTest(omfv.lm)
## Non-constant Variance Score Test
## Variance formula: ~ fitted.values
## Chisquare = 3.127737e-05, Df = 1, p = 0.99554
# no outliers
# check for non-linear predictors (termplot can accept interaction terms)
termplot(lm(kills~., data=omfv), partial.resid = TRUE, col.res='blue', se=T)
# no strong evidence for a need to transform
# we either need to accept the violation of normally distributed residuals
# or us non-parametric techniques (a later chapter)
# final model
summary(omfv.lm)
##
## Call:
## lm(formula = kills ~ atgm + sensor + aps + atgm:sensor, data = omfv)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.6190 -2.8208 -0.4591 1.5386 8.3775
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.65067 2.63551 4.421 0.000263 ***
## atgm -0.49023 0.44588 -1.099 0.284627
## sensor -0.58596 0.35820 -1.636 0.117513
## aps 5.32801 1.90637 2.795 0.011184 *
## atgm:sensor 0.54810 0.06419 8.539 4.2e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.044 on 20 degrees of freedom
## Multiple R-squared: 0.9247, Adjusted R-squared: 0.9096
## F-statistic: 61.36 on 4 and 20 DF, p-value: 6.041e-11
# interpretation
# the baseline vehicle gets 11.7 kills on average
# gun is insignificant
# for every 1 km increase in atgm range, kills decrease by 0.5
# for every 1 km increase in sensor range, kills decrease by 0.6
# upgrading the aps from no protection to full protection, increases kills by 5.3
# for every 1 km increase in atgm range, if sensor range is also increased, kills increase by 0.5
If you identified an interaction between two factors, plot the response surface for those two factors. Otherwise, plot the response surface for any two of your final model’s significant factors. Interpret the plot.
killz = as.matrix(omfv %>%
group_by(atgm, sensor) %>%
summarize(meanKills = mean(kills)) %>%
pivot_wider(names_from = atgm, values_from = meanKills))
## `summarise()` has grouped output by 'atgm'. You can override using the `.groups` argument.
plot_ly() %>%
add_surface(x=~colnames(killz)[2:4], y=~killz[, 1], z=~killz) %>%
layout(
title = "ATGM Sensor Response Surface",
scene = list(
xaxis = list(title = "ATGM"),
yaxis = list(title = "Sensor"),
zaxis = list(title = "Kills"),
camera = list(eye = list(x = -1.25, y = -1.25, z = 1.25))
))
# the plot shows the dependency between sensor range and atgm range
# basically, increasing the sensor range beyond the atgm provides
# little additional benefit. Keep in minds with this plot, we are
# still accounting for the contribution of the other factors, so
# the kills shown isn't strictly the result of only the atgm and sensor.